ABSTRACT

This review of research connected to Project Student 
Teacher Achievement Ratio (STAR) is a summary of the project's 
ongoing work in the form of a letter from researchers to a 
hypothetical colleague. Project STAR has investigated the effect on 
student achievement and development of small classes in the primary 
grades (K through 3). A research consortium of four universities and 
the State Education Agency was formed to conduct and monitor the 
study with the aid of an advisory panel and outside consultants. 
Forty-two districts and 79 schools (later 76) participated. Three 
major subsidiary studies have built on STAR: (1) the Lasting Benefits
study that tracks students; (2) Project Challenge, a policy 
application of small classes in 16 low-performing school systems; and 
(3) the Grade 4 (and eventually Grade 8) participation study. The 
database established for STAR, which now includes about 9,000 
students randomly assigned to small-class, regular-class, and 
regular-class with aide conditions, is serving as the basis for other
research, including examinations of achievement, racial differences, 
the interaction of school size and class size, and other state 
initiatives. Three appendixes present tables of research design, 
analysis of variance for cognitive outcomes, and rankings of
Challenge districts. (Contains 8 tables and 33 references.) (SLD)

Precis



Using an epistolary format, the researchers explain to a doubting peer some of the 
continuing work on the Project STAR data, and on data arising from added class-size studies 
that build upon the STAR database. The research, begun in 1985, is still going on and being
extended. Reanalyses of the original data are providing added questions and answers 
regarding class sizes and pupil learning as measured by various tests.

The Multiple Benefits of Class-Size Research: A Review of
STAR's Legacy, Subsidiary and Ancillary Studies

Although Project STAR has been around for nearly a decade, it is only now reaching the 
"maturity" when it can be engaged to answer a wide variety of questions. Each year, 
researchers use the STAR design and data to build a knowledge base on class-size issues. 
Tentative plans are to follow a large STAR cohort through grade 12. 

Authors will briefly review STAR, Tennessee’s longitudinal randomized experiment 
(1985-90). They will discuss three major subsidiary studies that build upon STAR (Lasting
Benefits Study or LBS that is tracking STAR students; Project Challenge that is a policy 
application of small classes in 16 low-performing Tennessee systems; and the Grade-Four - and 
Grade-Eight if it is ready - Participation Study). They will describe how the large, carefully- 
developed STAR database offers a "mother lode" of opportunity to mine rich ancillary 
questions relating especially to early elementary education. Ancillary studies use the STAR 
database, but are further removed than the subsidiary studies. 

Besides the reviews of design and study results, researchers discuss how a database 
established specifically for one purpose supports a stream of research deriving from the 
primary research question. Two major criticisms of STAR encouraged the researchers to review 
the study, to make adjustments and re-analyze some data. The revised results substantiated
and strengthened the original analyses, nearly doubling some Effect Sizes (ES). 

The STAR/LBS database includes approximately 9,000 pupils randomly assigned in K-3
to one of three conditions: Small class (S) of about 1:15; Regular class (R) of about 1:25; and a
Regular class with a full-time Aide (RA). Districts in STAR provided test-score results from 21
"comparison" schools of similar demographics to STAR schools for grades K-3. LBS has
followed approximately 4,500 pupils from STAR into grade 8 (1994). Data include test scores on
the Stanford Achievement Tests (SAT) and Tennessee Comprehensive Assessment Program 
(TCAP), responses to questionnaires and interviews, and demographics. 

Employing purposive subsamples from STAR data, from comparison schools and from
LBS data, researchers have used descriptive, multivariate and univariate analyses to answer
such questions as: 1) What is the test-score "value" of K to pupils (by race) in grades 1, 2 and
3?; 2) Is small-class placement a remedial strategy for test-score gaps between white and non-white
pupils?; 3) How do class size and school size interact in early grades?; 4) How does early
class-size placement influence pupil participation in and identification with school?; 5) Does
small-class placement a) reduce retention in grade, and/or b) help achievement of pupils once
retained?; 6) How long do early test-score benefits remain for pupils from the (S) condition and
how much do they fade (LBS)?; 7) How do STAR/LBS results apply to Project Challenge?; 8) How
does the Tennessee Value-Added Assessment System (TVAAS) support Project Challenge?
Researchers are interested in group discussions and the potential that participants can identify
new questions for analysis using the databases. Researchers hope to engage the audience in
considering ways to make small classes (1:15) more palatable to policy and funding agencies.

The Omphalos Research Center
April T 1984 



Dr. Ima Snipe 
Skeptos Institute 
Showme, Missouri 

Dear Dr. Snipe: 

Although this may sound insincere, we actually thank you for belatedly raising questions about the 
line of research that we have been exploring and reporting for the past eight years. Belatedly, we believe,
because your questions go back to the initial analyses that we reported beginning more than five years 
ago. We have been continuing our work and have moved well beyond some of your concerns. Had you 
contacted us directly for clarification (rather than making tenuous assertions and unsupported 
generalizations in non-refereed publications and "in-house" documents) we might have answered your 
questions more exactly and in a timely fashion. 

On the other hand, careful critique keeps researchers honest and humble. Without your skepticism 
and questioning, we probably would not have returned to the original, excellent database — and that 
would have been an error. The tone of your questioning served only to fan our ardor and set a "we'll
show 'em" tone that ensured more thorough analyses.

Yet, in spite of the elation of "completing" a study there always are twinges of underlying doubt in 
good research: Did we do everything right? So, while some people may tire of their seemingly endless 
repetitions, peer review and critique have clear benefits. Not only does peer review help "keep us 
honest," it brings to the problem a new set of eyes; it may ask questions beyond those that we were 
satisfied to answer; it may interject ideas based upon a different reading and interpretation of the research
and literature supporting the original line of inquiry; it may identify gaps in the rigor or breadth of the
analyses; and it may make connections with other studies and related fields that we skipped over, etc.

In this serious vein, then, we have considered carefully the concerns that you raised. Your
questions started us on the task of re-analyzing the initial database in slightly different ways, and we
reformulated some of the initial questions in slightly different ways. Perhaps we get ahead of ourselves;
let’s return to the original purpose of the initial study and start there to take another journey through the
database and analyses. In constructing this revisitation, I have followed an outline format. For ease in
following the narrative, here is the outline.

I. Some History. 

A. Background 

B. Original Study Purpose 

C. Synopsis of Methods (Appendix A) 

D. Synopsis of Results (Appendix B) 

II. The Derivative Studies (LBS and Challenge). 

III. Some Ancillary Studies (class size/school size, test-score value of K, retention in grade, test-score 
"gap" reduction, homogeneous vs. heterogeneous grouping and achievement, etc.). 

IV. Questions and Answers Raised by Your Critique; References and Bibliography.

I. Some History

A. Background 

In 1985 the Tennessee legislature commissioned Project STAR (Student Teacher Achievement Ratio) 
to try to get some answers to the basic question, "What is the effect on pupil achievement (and
development) of small classes (e.g., about 1:15) in early primary grades (K-3)?" A research consortium of 
four major Tennessee universities and the State Education Agency (SEA) was formed to conduct and 
monitor the study aided by an advisory panel and outside consultant help. All Tennessee districts 
(n=140) were invited to participate, resulting in 42 districts and 79 schools (later reduced to 76). Selected 
districts did not differ from others in the state except slightly in size (the three largest systems were in the
final sample) and fairly represented urban, rural, inner-city and suburban areas. Districts agreed to STAR
procedures and to remain in the study for four years. Researchers manipulated only one variable, class 
size, and were to assure that no pupil received any diminution of services by being in STAR. 

B. Original Study Purpose 

The study was to answer the basic research question (see I.A above) and to try to get a definitive
answer to the class-size issue and debate (e.g., Glass & Smith, 1978; Glass et al., 1982; Cahen et al., 1983;
Education Research Service or ERS, 1978 & 1980). The policy makers in Tennessee wanted this
information as a basis for setting state regulations on class sizes (Tomlinson, 1988, 1990). 

C. Synopsis of Methods 

Researchers assigned pupils in kindergarten (K) in the participating schools in 1985 to one of three 
class-size conditions: a Small class (S) of about n=15 with a range of 13-17; a Regular class (R) of about 24 
with a range of 22-26, and a Regular class with a full time aide (RA). Once designated as S, R, or RA a 
class remained so designated for the duration of the study and pupils remained in the class conditions. 
Pupil mobility was handled by random replacement. Since Tennessee did not have mandatory K during 
STAR, the increase in students to STAR in grade one required researchers to establish some additional 
classes. Use of an "in-school design" where each participating school had at least one class of each 
condition (S,R,RA) helped control for building-level and district-level variables. In all analyses there 
were approximately 100 classes of each of the three conditions. A multivariate analysis of variance 
(MANOVA) was used (Finn & Bock, 1985). The basic design has been reported elsewhere (e.g., Finn &
Achilles, 1990) and is in most papers and documents generated for the study (e.g., Word et al., 1990).
Appendix A is a summary of design and analysis steps and is reproduced from Achilles et al. (1993, pp.
617-618). 

D. Synopsis of Primary Findings. 

The rather straightforward analysis compared the class means of the three class-size conditions
(S,R,RA). Although data were collected on pupils on both norm-referenced tests (NRT) and
criterion-referenced tests (CRT), as well as on other measures, data were analyzed as class means
because this was a study of class-size effects. Results consistently (grades K-3) showed that pupils
in the (S) condition outperformed pupils in other conditions. Generally the (R) and (RA) results were
quite similar. These results have been reported in detail elsewhere (e.g., Finn & Achilles, 1990; Word
et al., 1990). Appendix B
is a summary of results reproduced from Achilles et al. (1993, p. 619). 

Small classes in this study outperformed (R) and (RA) classes in all locations and on all cognitive 
measures, generally at or beyond p < .001. This translated into effect sizes (ES) ranging from .25 to .44, 
depending upon the comparisons. These results came from a carefully designed longitudinal (K-3) 
experiment with random assignment and the manipulation of just one variable, class size. Since the 
results were so clear and consistent (note Appendix B), and the answer to the basic question so 
unambiguous, researchers essentially left the STAR database and turned their attention elsewhere. 

II. The Derivative Studies (LBS and Challenge)

After answering the class-size issue posed in STAR's enabling legislation, researchers sought and
obtained modest funding to follow as many STAR pupils as possible as they moved out of the 
experimental conditions and into "regular" classes at grade 4 and beyond. This was called the Lasting 
Benefits Study (LBS). By 1993-94 many STAR pupils were in grade 8. Data analyses for grades 4-6 show 
continuing achievement benefits (p < .01 and ES from .15 to .25) for pupils who had been in (S) classes even
three years after their return to "regular" classes. (Analyses are in process for grade 7.)

Starting in 1989 leaders in Tennessee made funds available in 17 of the state’s poor counties for 
broad-scale class-size reduction (Project Challenge). Using only the gross measure of ranking, on average 
these 17 systems have increased their rank 12 places in reading scores and 26 places in math scores 
(grade-two data) between 1989-90 and 1991-92 among the state's 138 districts. (See Appendix C for a 
summary.) 

Some researchers recognized that STAR and LBS were generating a useful database and so they 
framed other questions that the database could help answer. We are just getting started with this series of
studies, which we are calling the Ancillary Studies. Even though the database is again attracting modest
attention, researchers had not decided to return to the original STAR analyses until there were some
questions raised about STAR in other writings (e.g., Tomlinson, 1988; Mitchell et al., 1989). Then, of
course, there are the concerns that you have raised and some nagging notions, reflected also in Robinson
(1990), that we did not learn enough in STAR. But first, just a note on the Ancillary Studies.

III. The Ancillary Studies

The mass of data collected for STAR and LBS surely could help in exploring additional questions of 
concern to educators. Not often does one have available a longitudinal database of over 10,000 pupils, 
many of whom were at one time randomly assigned, etc. At the outset of STAR the researchers selected a 
set of "comparison" schools from districts that had a school in STAR. Each comparison school was to be
as similar as possible to the STAR school (demographics). There were no interventions; researchers only 
collected the test data for the parallel classes (K-3) for the STAR years (1985-86 to 1988-89). 

As it moved ahead, for example from grade 1 to grade 2, the STAR cohort encountered those pupils 
who had been retained in grade 2 the previous year. Each year, all new students in STAR cohort sites 
(including retainees) were assigned to classes (S,R,RA) at random. Researchers also collected a file of data 
about each pupil, each teacher, each school, etc. In grade 4 (LBS) researchers collected data on student 
participation in school (Finn & Cox, 1992; Finn, 1993) from the available STAR pupils. The participation 
study became the first of the formal "ancillary" studies. 

Given the data and the opportunity, researchers have recently initiated the series of Ancillary 
Studies. The following are very brief summaries of some Ancillary Studies to date. 

A. Participation. Students from (S) conditions were more actively involved (participating) in grade 4
than were students from (R) or (RA) conditions (p < .05) (Finn & Cox, 1992). We have repeated this in
grade 8 (Finn et al., in process) and hope to repeat the study again in high school. 

B. Retention in grade. Small classes do not help students who have been previously retained.
Retainees do poorly (academically) in all class conditions, including the (S) condition. Initial placement in 
(S) seems to prevent or deter retainment; (S) is not a treatment for remediation later (Harvey, 1993). 

C. "Gap reduction" between White and minority pupils. As in the case of retention, the (S) condition seems to
prevent a large gap opening (K-3) between scores of White and minority pupils. The (S) condition helps
minority pupils proportionately more than it helps White pupils. Once the test-score gap opens, (S) is not
an effective treatment or a remedy (Bingham, 1993). However, (S) placement seems to help keep this gap
from opening.

D. Homogeneous or heterogeneous? Using the randomly-assigned STAR (R) and the comparison
school (non-random assigned) students as comparisons, test-score achievement results favor the
randomly assigned pupils (ANOVA, ANCOVA) increasingly from K to grade 3 (Zaharias, 1993).

E. Class size/school size. Numerous studies (e.g., Kounin & Gump, 1944; Fowler & Walberg, 1991)
have shown that students in large schools get lower test scores and have lower participation rates than do
students in small schools. Nye (in process) is exploring if small class placement tends to ameliorate the 
school-size effect. Initial results suggest that this is the case. 

F. Other studies are planned in this series, including one about discipline, one on school effects, and 
an entire series on the teacher aide issue. A study of the "Test score value of kindergarten in later years:
Grades 1, 2 and 3" has shown substantial (ES .40 to .50 or more) benefits to pupils, especially in (S) classes
(Nye, Achilles, & Bain, 1994). Researchers hope to explore the range of questions inherent in "Is the (S)
treatment a preventive or a remedial event?"

G. By Grade 7 pupil data were being entered into the state monitor system, so we're working on
studies of behavior (discipline). 

H. Tennessee has established an important test-data system, the Tennessee Value-Added Assessment 
System (TVAAS). We are comparing our LBS and Challenge results with this database. So far, all 
analyses are confirmatory. 

I. A re-analysis of the teacher aide question is planned, as is a study of the effects of class-size 
reduction on teacher classroom behavior. Data for this were actually collected (pretest or pre-intervention
data) in 1985-86, with post-test data collected in 1986-87 for Grade 1.

IV. Questions and Answers Raised by Your Critique 
This background is important as a base for understanding the processes we followed to explore 
answers to the questions and issues that you raised. As I understand it, a general issue was that although
STAR showed the benefits of (S) for pupil achievement, you were disappointed that STAR was a
"mundane" study in that there were no steps to explain such things as why the (S) pupils may do better,
or to discuss what teachers may have done differently. Indeed, it seems inconsequential to advocate for
(S) since other interventions (e.g., Success for All; Slavin et al., 1990; Madden et al., 1993) get considerably
larger effect sizes (ES), and since there were no additional, probing analyses to try to explain why we got
what we got. (We attribute it to class size - period!) Indeed, on some of these points you are correct: we
answered the question put before us and, with little ado, moved to the Derivative and Ancillary Studies.
Some points that you raise cannot easily be resolved in a letter with a brief "yes" or "no"; they need a
symposium-like atmosphere for discussion and they beg for lively interaction with the "on-line" STAR
and LBS data. Only then, guided by both critics and supporters of the research might we get some 
resolution. Nevertheless, since you have raised the questions and issues, let me try to address them. 

1. You point out that in STAR most gains were in K and 1, and that in later years (grades 2 and 3) 
pupils seem only to hold the gains from the early years. On the surface, this seems substantially correct, 
but our year-to-year analyses (cross-sectional) did show annual differences. 

The "learned more each year" issue is not easily answered, and we continue to search through the 
data for clues. We do know that each year on the CRT (new objectives) the (S) students did better than 
those in (R) or (RA). We do know that the (S) pupils achieved what they did in reading and math in
considerably less time per day than the other pupils. Evertson and Folger (1989) noted:

Teachers in the small classes devoted an average of an hour (64 min.) to reading instruction, while 
teachers in regular classes spent an hour and twenty-four minutes (84 min.). This might be 
expected considering that teachers in regular classes were instructing 1/3 more students. In fact,
the time spent in the small classes reflects an increase of time per individual pupil of nearly a
minute. (p. 7)

Since (S) pupils learned more in less time, we presume that they did not stop learning in the time 
saved. The math here is impressive: 20 min/day x 150 days = 3,000 minutes or 50 hours/year. They
learned more, we think. Tests in later years show (S) students ahead of (R) and (RA) students in all 
subjects tested (not just reading and math). 
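
The arithmetic behind the 50-hour figure can be checked in a few lines. This is only a sketch using the numbers already stated (64 and 84 minutes of daily reading instruction, and the 150 days used in the calculation); the variable names are ours.

```python
# Daily reading-instruction minutes reported by Evertson and Folger (1989).
SMALL_CLASS_MIN_PER_DAY = 64
REGULAR_CLASS_MIN_PER_DAY = 84

# Number of days used in the letter's calculation.
DAYS = 150

# Minutes per day that (S) teachers spent less on reading instruction.
minutes_saved_per_day = REGULAR_CLASS_MIN_PER_DAY - SMALL_CLASS_MIN_PER_DAY

# Scale up to a year, then convert minutes to hours.
minutes_saved_per_year = minutes_saved_per_day * DAYS
hours_saved_per_year = minutes_saved_per_year / 60

print(minutes_saved_per_day, minutes_saved_per_year, hours_saved_per_year)
```

The 20 minutes per day is simply the gap between the two reported averages; everything else follows from it.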

On the topic of additiveness, I believe that Mitchell et al. (1989) and I would agree (in theory) that
there should be a cumulative effect (p. 38), so the lack of an increase in effect size in the original analyses
is disturbing. Avenues that we are exploring relate to retention in grade, special education placement
and "passing" scores. For example, a smaller percentage of pupils is retained in the (S) classes than in (R)
and the range of scores of promoted students is larger in (S) than in (R), 19 to 8. (See Table 1.) This means
that as the (S) class moved along as a cohort it accumulated lower scoring pupils than did the (R) class.
(Also, it had higher-scoring pupils due to the class-size impact. The ratio may be key.) As STAR cohorts
moved through the grades (K-1-2-3) they would "pick up" pupils retained in grade the previous years.
Again, the ratio is important. One major issue here, however, is a minor design flaw. As the years of the
study went by, the line between (S) and (R) became blurred. (More on this later.)






Table 1. Range of scores for promotion/retention by class type for STAR, Kindergarten to Grade 1.
(Scores on SESAT.)

                                        DIFFERENCES
                S      R      RA      S-R    S-RA    R-RA
Promote        441    435    436       6       5      -1
Retain         422    427    421      -5       1       6
Difference      19      8     15      11       4      -7

Since the original analysis answered the original questions, we probably would not have revisited 
the STAR database if you (and others) had not raised some questions. Our responsibility to the State 
Board and legislature was to research the question in the legislation (which we did). The answer clearly 
was that students in (S) classes do statistically (and educationally) better than students in (R) and (RA)
classes in all locations. When people began to ask about the size of the difference (new questions) we
returned to the data. It is very clear that the original reported results were quite conservative.
However, discussion specifically about the 1:15 ratio spurred us to run for public consumption the
frequency distributions of class sizes in the S,R,RA conditions over the four years (K,1,2,3) of the study.
Table 2 provides the frequency distribution information. Several points are important here. 



Table 2 about here



a. A class was designated S,R,RA based upon the K distribution. If an (S) class grew "out of range,"
we still analyzed it as (S) as the pupils were in (S) in K. A class designated (R) was still treated as
(R) even though it may have shrunk "out of range," a particular possibility with large levels of
retention in grade, and a possible explanation for a diminution of differences between scores of
(S) and (R) or (RA) in later grades. These analyses will mean that we’ll lose some classes; but a
class (n) of 50 or so should still be enough for this analysis.

b. Most in-range (12-17) small classes have 16 or 17 pupils, toward the "large" end of (S), and larger
than 1:15 (except grade 2); the preponderance of in-range (22-26) regular classes have 22-23 pupils
and are toward the "small" end of the (R) range (except grade 3). Thus, many (S) classes in the
original analyses had more than 15 pupils and many (R) classes had fewer pupils than 24.
Classes were "out of range," and the debate still spoke of 1:15.

c. The distribution of "out of range" classes (n=18-21) seems to favor (R) or (RA), but this remains to
be tested. Even though we have "lost" some classes, early analyses show that the results are
significant and show larger (S) benefits than found in the original analyses. Tables 3 and 4 show
basic Pearson (r) for class size with average scores for each class size using the composite of (S)
and (R) (Table 3) and just the RA classes (Table 4). Column A in Tables 3 and 4 shows the "legal"
correlation using all classes ranging in size from 12-27; column B (Table 3 only) is based on in-range
classes and suggests that removal of out-of-range classes may have some merit.

d. Note that most of the "Most Effective" teachers based on gain scores in Grade 3 were in (S) (Table
5), and for this analysis, (S) was 13-17; some were above 1:15.



Tables 3,4, and 5 about here 



2. Your second point is that you consider the finding of class-size effects, by itself, not very exciting
- indeed, "mundane" is your word. Study results do say that small classes by themselves lead to
improved achievement and to get that achievement, teachers needn't do anything special. Finn's notion
of "participation" may be important here. The work of Slavin and others (Slavin et al., 1990; Madden et
al., 1993) shows that if teachers do other things, achievement goes up. Our research suggests that some of
Slavin et al.'s ES may well be due only to starting with a base of n=15 or so. In fact, probably one-third to
one-half of the ES found in the Slavin et al. work may be due only to class size, and this condition
needs to be met prior to the benefits of other interventions.

We should not argue over your use of "mundane." If "mundane" serves your purposes, surely you
should use that descriptor. We are, however, gratified to find among the poor research (or data) out there
often called "education research," at least one fairly secure finding supported by a strong design,
longitudinal analysis, etc. We probably could argue on this for awhile, but - frankly - we find most of
the one-shot questionnaire stuff pretty useless; indeed, mundane. In fact, the OERI "Background
Document on Proposed Funding" for CSHARS notes: "Ideally, in order for scientific findings to be
considered 'definitive,' they need to be executed with rigorous techniques, they need to be sufficiently
robust to be generalizable beyond the study sample. . . . Because few social science studies meet these
criteria. . ." (p. 14). The rigorous STAR/LBS study at least meets this criterion of "definitive" - even if it
is "mundane." Again, we're content with the conservative results of STAR.

This brings us to the multivariate analysis (MANOVA) that we conducted (Finn & Bock, 1985) after
we removed the out-of-range classes. (See Table 2.) The results of this analysis using classes of 13-15 as
(S) and of 23-27 as (R) (pupils tested, not pupils assigned, so results will still be conservative) appear in
Table 6. Although the results are statistically significant (usually beyond p < .01), the key findings are
probably reflected in the reported effect sizes. They range from .33 to .71 and generally show a trend of
increasing from grades 1 to 3. This analysis shows a greater effect of (S) once the out-of-range classes are
cleaned out of the analysis and now we have an ES consistently over .50 at nearly all grades. This
compares favorably with some planned interventions to cause achievement gain -- a question totally
different from the one addressed by STAR.
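
To make the "cleaning" step concrete, here is a minimal sketch of how out-of-range classes might be dropped before such a re-analysis. It is an illustration only, not the procedure actually run on the STAR files: the record layout, the function name and the four example classes are hypothetical, and only the size ranges (13-15 for S, 23-27 for R, counted as pupils tested) come from the text. (RA) classes are left out because no range is given for them here.

```python
# In-range sizes (pupils tested) for the re-analysis: 13-15 for (S), 23-27 for (R).
IN_RANGE = {"S": range(13, 16), "R": range(23, 28)}

def keep_in_range(classes):
    """Keep only classes whose tested size falls inside the range for their designation."""
    return [
        c for c in classes
        if c["condition"] in IN_RANGE and c["n_tested"] in IN_RANGE[c["condition"]]
    ]

# Hypothetical class records, for illustration only (not STAR data).
example_classes = [
    {"id": "a", "condition": "S", "n_tested": 14},
    {"id": "b", "condition": "S", "n_tested": 17},  # too large for (S)
    {"id": "c", "condition": "R", "n_tested": 24},
    {"id": "d", "condition": "R", "n_tested": 20},  # too small for (R)
]

kept = keep_in_range(example_classes)
print([c["id"] for c in kept])
```

Each class keeps its original designation; it is either retained or excluded, never moved from one condition to another.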



Table 6 about here 



3. You ask why the STAR and LBS studies found continuing benefits for (S) for pupil achievement 
while prior studies had not found such continuing benefits. One reason is that most STAR pupils were in 
(S) for K-3, or at least for several years. Other studies were short term, even "one-shot." 

4. Some of the "puzzling findings" that you note have also been found in other studies of early 
intervention, such as the Head Start evaluations and other studies where there is a "fade" in later grades 
(Weikart, 1989; Zigler, 1992; McMasters, 1991; and other sources). The initial STAR analysis seemed to 
find the same thing. One way to explain it is to consider the students who would be new to STAR each 
year (often retainees) and that in (S) teachers kept lower scoring students moving ahead while in (R) the 
lower scoring students were more likely to be retained (Table 1). Another approach is to re-analyze the
STAR data after removing the out-of-range classes and considering the proportion (%) of (S) and (R)
classes in the top and in the bottom quartiles of classes based upon test scores.

Table 7 shows what percent of (S) and (R) classes are in the top 25%, middle 50% and bottom 25%
of classes based on reading test scores. Notice that consistently the STAR (S) classes are over-represented
(from 8% to 11%) in the top-scoring classes and under-represented in the bottom, while the reverse is true
for (R) and that there isn't much variation in the middle 50% group.

Table 8 examines the question after removing the out-of-range classes and comparing the percent of 
(S) and (R) classes in the top 25% and bottom 25% as a proportion of the percent of (S) and of (R) classes at 
each grade. The (S) classes are 39% of all K classes, but 51% of the top 25%, or a difference of +12% from 
what might be expected. The constant positive differences for (S) in the top (from +11% to +14%) and
constant minus differences for (S) in the bottom (-4 to -14) and the opposite for the (R) classes show the 
consistent and generally increasing benefits of the (S) condition even with the retainee and low-scoring 
student phenomena mentioned earlier. This, of course, raises a new set of questions for exploration, such 
as removal of retainees prior to the analyses, etc. 
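
The "difference from what might be expected" in Table 8 is a simple subtraction: a condition's share of the top (or bottom) quartile minus its share of all classes at that grade. A short sketch, using only the kindergarten figures quoted above (the function name is ours):

```python
def over_representation(share_of_all_classes, share_of_quartile):
    """Percentage-point gap between a condition's share of a quartile and its overall share."""
    return share_of_quartile - share_of_all_classes

# Kindergarten: (S) classes are 39% of all classes but 51% of the top 25%.
k_small_top = over_representation(39, 51)
print(k_small_top)
```

A positive value means the condition appears in that quartile more often than its overall share would predict; a negative value means less often.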



Tables 7 and 8 about here 



So, we trust that this helps clarify some of your questions. If you (and others) had not raised 
questions - often about the power of the differences that we initially found - we probably would not 
have revisited the analysis. In fact, the benefit of your questions (and, we'll admit, we thought that you 
were stretching things a bit to find fault) is that we have decided to do what we should have considered 
before — some secondary analyses to answer questions that we should have anticipated. A major 
database, collected carefully and at great expense, should help with nagging questions. We hope that this 
letter will give you fodder for future questions. Next time, please contact us directly so that we can all 
benefit from the question asking and problem finding, not just from the problem solving. 

Today's children (Hamburg, 1992; Hodgkinson, 1991 and 1992) are different from yesterday's, and 
their needs are far different. A small-class start in school seems to be an outstanding option for 
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preventing future problems (e.g., Finn, 1993; Bingham, 1993; Harvey, 1993; etc.). Tutoring is a one-to-one 
option, the ultimate in class-size reduction; it is quite expensive compared to 1:15. Yet, given the 
condition of young children entering schools today (e.g., Hodgkinson, 1992) we know of few options that 
provide the positive results of class-size reductions, and we know of few "programs" successful in raising 
scores of pupils that do not start with small groups or classes. 

We can speculate from STAR/LBS results that the use of (S) in early primary grades has the 
potential of being expensive, but of actually offering a path to cost savings: 

a) By reducing the need for grade retention (an expensive and non-productive form of witchcraft 
in the field). 

b) By reducing the pesky gap in achievement between the poor and/or minority pupil and the 
mainstream pupils who have traditionally benefited most from formal schooling. 

c) By establishing work conditions that support the idea of mainstreaming and that help teachers 
identify pupil learning difficulties early, thus saving on special-education costs. 

Additionally, we should not exclude the potential for teacher satisfactions gained by successes 
achieved through working with a manageable group size in one classroom. We shall continue to inform 
you of our work as it progresses. 



Sincerely, 



Author Notes 



C.M. Achilles is Professor, Educational Leadership at Eastern Michigan University, Ypsilanti, MI 
48197. Barbara A. Nye is Director, Center of Excellence for Research in the Basic Skills, Tennessee State 
University, Nashville, TN 37203-3401 (615-963-7238). Dr. J. Zaharias, D. Fulton, and V. Cain are 
researchers at the Center. Among other things the Center is managing the Student-Teacher Achievement 
Ratio (STAR) database, conducting the Lasting Benefits Study (LBS) which is still following a major 
cohort of STAR students who are now in grade 8, and providing some evaluation support for Project 
Challenge, which is a general application of small classes (1:15) in 17 poor Tennessee counties, in line 
with STAR findings. 

C.M. Achilles was one of the original principal investigators on STAR (H.P. Bain, J. Folger, J. 
Johnston, F. Bellott, E. Word were the others). He continues to re-analyze the STAR database and to try to 
address new questions through the STAR, LBS and Challenge connections. 

The authors wish to thank all of the persons who have reviewed prior papers that provided the 
base for the present one. Dr. Jeremy Finn assisted with analysis; Drs. Richard Hoopier, G. Bobett and H. 
Bain assisted with analysis and interpretation. Special thanks to the various skeptics who have been 
combined into the mythic, composite Dr. Snipe, to whom this letter is addressed. 
Table 2. Distribution of STAR classes by grade (K-3) by designation S (Small), R (Regular), and RA 
(Regular and Aide). 
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A = range for (S); B = "out of range"; C = range for both (R) and (RA) classes. 





Table 3. Correlations of class size with reading and math (SAT) scores, K-3, Project STAR (1986-1989) 
using (S) and (R) classes only (A=all S and R) (B is S=13-15; R=23-26). 

                                   Grades 

               K                1                2                3 
            A       B        A       B        A       B        A       B 

Reading   -.19**  -.28**   -.27**  -.24**   -.23**  -.35**   -.23**  -.30** 

Math      -.14*   -.20**   -.26**  -.23**   -.18**  -.30**   -.18**  -.26** 

Sig: * = .05, ** = .01 



Table 4. Correlations of class size with reading and math (SAT) scores, K-3, Project STAR (1986-1989) 
using (RA) classes only (A=all RA) (B is 23-27 pupils in RA). 



Grades 

K 1 2 3 

A 
.08 
.04 



Sig: * = .05, ** = .01 



Reading 

Math 



A 

-.11 

-.04 



A 

-.05 

.03 



A 

.13 

.08 



Table 5. Interviews of 50 "Most Effective" STAR Teachers (Based on Gain Scores) (Grade 3). 



Of those 50; 
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13-17 (S) 
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full-time 


24 


50 


aides (RA) 


100% 



*22-25 pupils is probably smaller than many regular classes nationwide. 
Table 6. Means, Standard Deviations, etc. for ANOVAs, grades K-3 for Small (n=13-15) and Regular (n=23-27) classes, STAR, showing significance 
and effect sizes (ES). 
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Table 7 



Distribution (%) of Small and Regular Classes into Top (25%), Middle (50%) and Bottom (25%) of Class 
Average Scores (Total Reading) Unadjusted for "Out of Range" 
Table 8 



Percent of High (Top 25%) and Low (Bottom 25%) Scoring Classes (Reading) that are Small (S) or Regular 
(R) by Grade 

K 1 2 3 
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A = Actual Percent 

Dif = Difference (+,-) from Expected distribution (see "Note"). 

Note: S classes are 39% of classes; R are 31% of the classes (K). 

S classes are 37% of classes; R are 34% of the classes (grade 1). 
S classes are 39% of classes; R are 30% of the classes (grade 2). 
S classes are 42% of classes; R are 26% of the classes (grade 3). 
STAR (1985-1989); LBS 1990-1991. APPENDIX B 
Rankings of Challenge Districts (n=17) of 138 TN School Systems Based on 
Grade 2 TCAP Scores (Reading and Math). (State TN ave. rank is 69.) 
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Table 1 



Samples of Studies Derived From and Building on the STAR Initiative Classed as 
"Subsidiary" (directly from STAR), "Ancillary" (building on and using STAR database) 
and "Related" (triggered by STAR results and usually involving STAR researchers) 
Studies. 



Category, title & purpose                        DATE(S)            AUTHOR(S) OR PUBLICATION 

Subsidiary Studies 

• Lasting Benefits Study                         1989-Present       Nye et al., 1994, 1993, 1992, 1991 
  to follow STAR pupils 
• Project Challenge                              1989-Present       Nye et al., 1994, 1993, 1992, 1991 
  1:15 in 17 poor counties 

Ancillary Studies (Use or extend STAR data. Some of these are dissertations.) 

• Retention in Grade                             1994               Harvey 
• Achievement Gap                                1994               Bingham 
• Participation in School Grades 5, 7            1990, 1994         Finn et al., 1989 
• Value of K in Classes of Varying Sizes         1985-89            Achilles et al., 1994 
  (test scores) 
• School-Size and Class Size Issues              1985-89            Nye, in process 
• Random v. Non-Random Pupil Assignment          1985-89            Zaharias, 1993 
  and Achievement 
• Class Size and Discipline in Grades 3, 5, 7    1988, 1990, 1992   In Process 
• Outstanding Teacher Analyses                   1985-89            Bain, Bain et al. 
  (top 10% of STAR teachers) 

Related Studies 

• Success Starts Small: Grade 1 in Chapter 1     1993-94            Achilles et al., 1994 
  (1:15, 1:25) Schools 
• Burke County (NC) Schools and Their            1991-94            SERVE, Achilles, et al. 
  1:15 Experiment 



*Note: This list is not complete. It provides samples of the types of studies. Not all 
"authors" appear in the references in the exact way that they are listed here. 










